nr 1
Constrained and Robust Policy Synthesis with Satisfiability-Modulo-Probabilistic-Model-Checking
Heck, Linus, Macák, Filip, Češka, Milan, Junges, Sebastian
The ability to compute reward-optimal policies for given and known finite Markov decision processes (MDPs) underpins a variety of applications across planning, controller synthesis, and verification. However, we often want policies (1) to be robust, i.e., they perform well on perturbations of the MDP and (2) to satisfy additional structural constraints regarding, e.g., their representation or implementation cost. Computing such robust and constrained policies is indeed computationally more challenging. This paper contributes the first approach to effectively compute robust policies subject to arbitrary structural constraints using a flexible and efficient framework. We achieve flexibility by allowing to express our constraints in a first-order theory over a set of MDPs, while the root for our efficiency lies in the tight integration of satisfi-ability solvers to handle the combinatorial nature of the problem and probabilistic model checking algorithms to handle the analysis of MDPs. Experiments on a few hundred benchmarks demonstrate the feasibility for constrained and robust policy synthesis and the competitiveness with state-of-the-art methods for various fragments of the problem.
Finite Sample Analysis of Distribution-Free Confidence Ellipsoids for Linear Regression
Szentpéteri, Szabolcs, Csáji, Balázs Csanád
The least squares (LS) estimate is the archetypical solution of linear regression problems. The asymptotic Gaussianity of the scaled LS error is often used to construct approximate confidence ellipsoids around the LS estimate, however, for finite samples these ellipsoids do not come with strict guarantees, unless some strong assumptions are made on the noise distributions. The paper studies the distribution-free Sign-Perturbed Sums (SPS) ellipsoidal outer approximation (EOA) algorithm which can construct non-asymptotically guaranteed confidence ellipsoids under mild assumptions, such as independent and symmetric noise terms. These ellipsoids have the same center and orientation as the classical asymptotic ellipsoids, only their radii are different, which radii can be computed by convex optimization. Here, we establish high probability non-asymptotic upper bounds for the sizes of SPS outer ellipsoids for linear regression problems and show that the volumes of these ellipsoids decrease at the optimal rate. Finally, the difference between our theoretical bounds and the empirical sizes of the regions are investigated experimentally.
Compact Autoregressive Network
Wang, Di, Huang, Feiqing, Zhao, Jingyu, Li, Guodong, Tian, Guangjian
Recurrent neural networks (RNN) and their variants, such as Long-Short Term Memory (Hochreiter and Schmidhuber, 1997) and Gated Recurrent Unit (Cho et al., 2014), are commonly used as the default architecture or even the synonym of sequence modeling by deep learning practitioners (Goodfellow et al., 2016). In the meanwhile, especially for high-dimensional time series, we may also consider the autoregressive modeling or multi-task learning, null y t f (y t 1, y t 2,..., y t P), (1) where the output null y t and each input y t i are N -dimensional, and the lag P can be very large for accomodating sequential dependence. Some non-recurrent feed-forward networks with convolutional or other certain architectures have been proposed recently for sequence modeling, and are shown to have state-of-the-art accuracy. For example, some autoregressive networks, such as PixelCNN (Van den Oord et al., 2016b) and WaveNet (Van den Oord et al., 2016a) for image and audio sequence modeling, are compelling alternatives to the recurrent networks. This paper aims at the autoregressive model (1) with a large number of sequences. This problem can be implemented by a fully connected network with NP inputs and N outputs.